TECHNICAL FIELD

The present invention relates to a system and method for measuring and/or
analysing usage of resources. More particularly, the present invention relates to
measuring and/or analysing usage of resources on a network using data sources
retrieved from actions performed by users of the resources, such measurement
and/or analysis providing information about resources that do not have available
statistics (such as site centric measurements), and combining them with site centric
data to create a more accurate whole-of-market picture or components thereof.

BACKGROUND ART 

In the light of the high penetration of Internet use and the rapid growth of the
on-line industry, there has arisen a need for an accurate and independent Internet
site rating service. Such a service should provide on-line industry users and
organisations and other interested parties with a precise vehicle with which to assess
vital Internet site traffic dynamics. For example, it would be advantageous for such
users and organisations to have an accurate picture of the information that Internet
users were viewing on and interacting with particular websites, as well as the range
of sites that target markets were visiting, the advertisements being viewed and how
particular sites compared statistically with competitor sites. This type of
commercial information is invaluable to those in the on-line industry wishing to
properly target their markets and also focus their on-line presence.

Furthermore, to date there has been no product or service for on-line
industry users and organisations that provides a total market rating system that uses
site centric measurements, such as proxy and server log files and browser based
measurements, and user centric measurements, such as panel data and sample
survey data. Furthermore, site and user centric measurements have not been used to
collect statistics pertaining to, for example, a website that has no site centric
measurement data available. Providing the sites with such information gives a
more accurate picture of the Internet population and of which sites the population
use or visit, regardless of whether site centric measurements are available for a
particular site.

A syndicated multimedia marketing data base has been used in Australia
which integrates consumer demographics, product usage and media consumption for
value-added marketing and media solutions. The data base enables advertising
planners, buyers and users to target their advertising campaigns and to plan and
evaluate integrated media campaigns based on the only official buying and selling
currencies for mainstream Australian media. The data base utilises the strengths of
the media industry's most widely used research tools, such as TV ratings data, radio
ratings data, readership surveys and service usage questionnaires. Each reporting
period the operator of this data base uses a combination of data to integrate TV
viewing data, updated each period, at the program level into a respondent single
source data set which may comprise up to, say, 40,000 respondents. This method is
used as a more integrated method of producing data sets capable of
cross-referencing television with other media and consumption variables. This
approach allows viewing information from the audited television ratings to be
analysed against usage, consumption and other media information. The television
data base is refreshed periodically so that the most current television program data
is available, with ratings consistent with the operator of the data base.

The abovementioned system does not allow the "fusion" of one data source,
created from measuring interactions of a sample of users in relation to their use of
the resources (for example, use of the internet), with a further source of data
pertaining to interactions provided by all users of the resource (measured from, for
example, a website, or viewers of a program measured by a television station) to
obtain accurate estimates of traffic densities at, for example, a particular website or
television program where the particular website or television station does not have
the further source available.

Known measurement techniques include server log file analysis. In
this method a log file is kept on the server recording all files requested, the IP
addresses of those visiting the site and the successful downloading of all resources
delivered from the site server. This method, however, is not necessarily an accurate
indication of resources used and/or viewed on the site, because it cannot account
for resources that are subsequently stored in proxy server caches or browser caches
and are re-viewed. For example, popular web pages may be stored on the proxy
servers of various Internet Service Providers (ISPs) around the world, so that the
ISPs do not need to directly access a popular site every time a user requests access
to that site. The ISP simply provides access to its stored version of the site. This
enables the ISPs to provide a more efficient service, but results in a less accurate
measurement service due to the inability to monitor caches.

Similarly, once a site is accessed, site resources are saved in the user's
browser cache while in use. While the server log file analysis may have recorded
data relating to the accessed resources at the time they were accessed, if the user
then returns to one or more pages, such as by hitting the "back" button on their
browser, then the resource being returned to is typically accessed from their browser
cache, so that once again this page request is not recorded by the server log file.

Another method used by some organisations is the so-called browser based
measurement approach. In this method, software monitors site resources as they are
viewed within a browser. This software monitors the user's actions when accessing
the Internet. While this approach does not suffer the accuracy problems of server
log file analysis, a problem that does exist with this approach is that for a complete
market analysis all sites need to agree to install the measurement code
on every site page. In practice, it has proven quite difficult to obtain the cooperation
of all sites.

In another method, also used by some organisations, Internet users are
recruited and their individual usage of the Internet is monitored to be used in
statistical analysis. Usage is monitored by installing hardware and/or software on
the user's computer. This hardware or software is not transparent to the user and is
often quite onerous, requiring the user to log the software on each time they use it.

An example of this method is provided in US Patent No. 5,675,510, where
personal computer use is measured through the use of a hardware box physically
located on the user's computer. This hardware records log files of Internet access
by the user. This process is expensive due to the hardware costs, installation costs
and maintenance and support costs. Furthermore, the process is quite obtrusive, as
the users are very conscious of the tracking as they see the box every time they use
their PC. Furthermore, the process does not track access by monitored users where,
for example, a monitored user accesses the internet at a location other than the
user's home or work. Examples of locations that are not monitored are cyber cafés,
educational facilities, friends' homes etc.

There is considered to be a need for an alternative measurement approach 
that provides accurate results and also has improved transparency for the user. 

SUMMARY OF THE INVENTION

According to a first aspect of the invention there is provided a method of
measuring and analysing multiple data sources over a communications network in
order to ascertain information about the use of one or more resources linked to said
communications network, said method comprising the steps of:

obtaining a data source for a first group of one or more monitored resources,
said first group linked to said communications network;

obtaining a further data source for a second group of one or more monitored
resources or a group of monitored users, each of said second group and said group
of monitored users linked to said communications network; and

combining said data source and said further data source to form a single data
source available to interested parties so as to ascertain usage information on one or
more resources.

The combining step may include one or more of displaying, aggregating,
transforming, calibrating or formatting said single data source via a reporting server
means through said communications network.

According to a second aspect of the invention there is provided a system for
measuring and analysing multiple data sources over a communications network in
order to ascertain information about the use of one or more resources linked to said
communications network, said system comprising:

a first group of one or more monitored resources, comprising resource
servers;

a second group of one or more monitored resources, comprising resource
servers;

a data collection and processing means for receiving a data source for said
first group of one or more monitored resources, and for receiving a further data
source for said second group of one or more monitored resources; and

reporting means for displaying said data source and said further data source
as a combined data source to interested parties so as to ascertain usage information
on one or more resources.

According to a third aspect of the invention there is provided a system for
measuring and analysing multiple data sources over a communications network in
order to ascertain information about the use of one or more resources linked to said
communications network, said system comprising:

a first group of one or more monitored resources, comprising resource
servers;

a second group of one or more monitored users, comprising resource servers;

a data collection and processing means for receiving a data source for said
first group of one or more monitored resources, and for receiving a further data
source for said second group of one or more monitored users; and

reporting means for displaying said data source and said further data source
as a combined data source to interested parties so as to ascertain usage information
on one or more resources.

According to a fourth aspect of the invention there is provided a network
enabling internet access by a user computer, characterised in that a connection
means on the user computer may be set to enable connection between a proxy server
and the user computer such that the proxy server is communicably coupled between
the connection means on the user computer and any internet site servers in order to
monitor the internet usage of the user.

In this regard, the expression "connection means" is taken to refer to the
means by which a user is provided with internet access, such as an internet browser.
Additionally, the user computer may be any means capable of receiving and
displaying information from the internet, such as a set-top internet terminal.

According to a fifth aspect of the invention, there is provided a method of
enabling research in a communications network having at least one user computer
with an internet browser, the method comprising the step of:

altering a proxy setting of the browser of the user's computer to divert the
user computer's internet access through a proxy server.

Therefore, by making a small change to the setting of a user's connection
means/network browser at only one point in time, it is possible to analyse the user's
network usage, without the need for installing any software, impacting on user time
or diverting their attention. This method is also able to overcome the measurement
problems pertaining to resources stored in caches.

According to a sixth aspect of the invention there is provided an apparatus for
measuring usage of internet resources, comprising:

a proxy server in communicable relation with a user browser, the
communicable relation effected via a proxy setting of the browser, such that the user
browser is capable of accessing at least one internet resource via the proxy server,
and the proxy server is capable of initiating usage measurement of the resource
accessed.

According to a seventh aspect of the invention there is provided a method of
measuring usage of internet resources comprising the steps of:

enabling a user's browser proxy setting to reference the location of a proxy
server;

receiving an internet resource request at the proxy server from the user's
browser;

forwarding the resource request to a resource server to obtain the requested
resource;

receiving the requested resource at the proxy server from the resource server;
and

passing the requested resource to the user's browser after the insertion of a
measurement code to monitor the usage of the requested resource.

And finally, according to an eighth aspect of the invention there is provided a
system for measuring and analysing multiple data sources over a communications
network in order to ascertain information about the use of one or more resources
linked to said communications network, said system comprising:

a plurality of resource servers;

an insertion server linking each resource server of said plurality of resource
servers to said communications network;

such that when a request for a monitored resource from any one of said
resource servers is made, measurement code is inserted into said requested
monitored resource by said insertion server for the purposes of measuring and
analysing usage of the monitored resource.

BRIEF DESCRIPTION OF THE DRAWINGS 

The invention will be hereinafter described in one or more preferred
embodiments with reference to the accompanying drawings, wherein:

Figure 1 is a schematic diagram of a system for measuring and analysing data
from data sources according to a first embodiment of the invention, particularly in
relation to use over the Internet;

Figure 2 is a schematic diagram of a system for measuring and analysing data
from data sources according to a second embodiment of the invention, particularly
in relation to accessing resources from WAP-enabled user interface devices;

Figure 3 is a schematic diagram of a system for measuring and analysing data
from data sources according to a third embodiment of the invention, particularly in
relation to using a digital television network;

Figure 4(a) is a schematic flow diagram showing the processes involved in
measuring and obtaining various data sources generally in accordance with the
invention;

Figure 4(b) is a schematic flow diagram showing the processes involved in
measuring, obtaining and processing various data sources and applying results to
extract data on unmonitored sources;

Figures 5(a) and 5(b) are block diagrams showing the processes involved
with data when access is made to an unmonitored resource and a monitored
resource;

Figure 6(a) is a schematic diagram of a system for measuring and analysing
data from data sources according to a further embodiment of the invention using a
proxy server;

Figure 6(b) is a schematic diagram of a system for measuring and analysing
data from data sources according to another embodiment using an independent
server; and

Figure 7 is a schematic diagram showing the processes involved in accessing
a resource via a proxy server.

DETAILED DESCRIPTION 

Shown in Figure 1 is a system 1 used to measure and analyse data in
accordance with the present invention. Various users, having user interface means
10, 12, 14 and 16, are linked to a communications network 18 which also has links
to various resource servers 2, 4, 6 and 8 through which the users can access
resources. There is generally a plurality of user interface means, which may include
but is not limited to the following group: PCs, handheld devices such as mobile
telephones or palmtops, television receivers or monitors, or any user interface device
capable of having information entered into, interacted with or viewed by the user.
There may be a plurality of resource servers, of which 2 to 8 are examples. The
communications network 18 may be the Internet or a digital or analog television
network or any circuit-switched or packet-switched network.

The embodiment shown in Figure 1 will be described with particular
reference to the Internet and the measurement and analysis of data sources from
monitored resource servers, for example servers 2 and 4, unmonitored resource
servers, say servers 6 and 8, and browsers installed on the user interface means. One
data source is measured using monitored resources. This data source may comprise
site centric measurements such as census data or audit data, proxy or server log
files, implemented using Java, JavaScript or CGI. The resources may be any one of
a web page (to measure the number of accesses to the web page), time spent on a
web site or web page, page impressions or a feature of a web page or web site that
is interacted with by one or more users or offers the option of a response to the
users. Resource owners agree to have their resources monitored to determine more
information about the behaviour of users who access these resources. Measurement
code, a form of program code, is embedded in, for example, every web page or
in each resource to be monitored. Every time a user accesses the
monitored resource, the measurement code in the downloaded resource records and
collects information on that user, and all such recordings for all users who access
that resource are forwarded to a data collection and processing means 20. Every
page could physically have code embedded therein, or the code could be dynamically
inserted by another component, such as a separate server 130 (as shown in Figure
6(b)), on its path to the user. Specifically, the data source from servers 2 and 4 is
received by a first collection server means comprising one or more collection
servers 22, 24. The abovementioned data source is then forwarded to a processing
server means 30 for processing, formatting, etc. and thereafter stored in data storage
means 35. The stored data may then be accessed by a reporting server means 34
such that the data is displayed or manipulated in some manner when accessed by
interested parties through the network 18. All of the server means 22, 24, 26, 28,
30, 32, 34 and 35 form part of a data collection and processing means 20 and each
of the tasks performed by these server means may be performed by one individual
server or a group of servers. For example, the collection servers and processing
servers may be one and the same server.

With reference to Figure 6(b), an insertion server 130 may be used to forward
all requests between the resource servers 2, 4 and 6 and the users 10, 12 and 14.
Measurement code can be inserted by the insertion server 130 into each monitored
resource requested from a respective resource server.
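
As a rough illustration of the dynamic insertion described above, the following
Python sketch shows how an intermediary might add a measurement snippet to an
HTML resource before passing it on. The snippet text, the collector URL and the
function name insert_measurement_code are hypothetical placeholders; the
specification does not prescribe a particular insertion point or code format.

```python
import re

# Hypothetical measurement snippet; the real measurement code is not specified.
MEASUREMENT_SNIPPET = '<script src="https://collector.example/m.js"></script>'

_BODY_CLOSE = re.compile(r"</body\s*>", re.IGNORECASE)


def insert_measurement_code(html: str, monitored: bool) -> str:
    """Return the resource, with measurement code added if it is monitored."""
    if not monitored:
        # Unmonitored resources are forwarded unchanged.
        return html
    matches = list(_BODY_CLOSE.finditer(html))
    if not matches:
        # No closing body tag found: append the snippet at the end.
        return html + MEASUREMENT_SNIPPET
    idx = matches[-1].start()
    # Place the snippet immediately before the last closing body tag.
    return html[:idx] + MEASUREMENT_SNIPPET + html[idx:]
```

Placing the snippet just before the closing body tag is only one possible choice,
made here so that the rest of the page is left untouched.
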

A further data source may be measured and analysed from a group of one or
more monitored or participating users. A random sample of monitored users is
recruited to form a panel whose interactions are measured and recorded in
terms of accessing monitored and unmonitored resources, at the resource servers,
via each user's browser, indicated by "B" in Figure 1. The monitored users give
their permission to allow the monitoring and tracking of their actions or interactions
and provide their personal details, such as where they live (region), sex, age,
income, and whether they are a home or business user. Reliable statistics on
Internet population data are used to determine preset demographic quotas for the
recruitment process. These users should be demographically representative of the
preset quotas, according to such criteria as age, sex, income and whether or not the
user is a business user or a private user.

The further data source may comprise user centric measurements including
panel data, sample data and survey data. Each monitored user of the group (otherwise
termed a "panellist") will have every page impression, web site access, or time spent
on a site or page or any other characteristic measured and recorded via measurement
code which is downloaded together with the requested resource to the panellist's
browser B. For example, then, if user interface means 10 and 12 are used by
panellists, each time they access or interact with monitored resources, at servers 2
and 4, and/or unmonitored resources through servers 6 and 8, these interactions are
recorded by a second collection server means comprising one or more collection
servers 26, 28. Identification means is transmitted to the collection servers 26, 28
identifying the user, after each interaction is recorded, either through some form of
identification means or cookies.

Processing server means 30 and 32 respectively receive the data source and
further data source to process the data. Thus processing server 32 processes data
forwarded to it from the second collection server means. Examples of processing
include aggregating or formatting the data, or calibrating the data for a particular
purpose. One example of processing the data sources includes calibrating them for
a particular purpose, such as calculating an error rate to determine an estimate for
interactions, such as page impressions, for an unmonitored site for which there is no
site centric data available. At this stage the received further data sources, as
processed by the processing servers 30 and 32 and subsequently stored in storage
means 35, may be viewed or displayed by interested parties on reporting server 34.
An example of the calibration process will hereinafter be described.

It is to be noted that the further data source may be of the same type as the
first mentioned data source, that is, from monitored resources.

Weighting may be applied to the data source and further data source
collected in each of the collection servers 22 to 28. This is performed by the
processing servers 30 and 32. The weighting is done to adjust for the difference in
demographic profiles between the sample or group and the population. The
population weightings are obtained from pre-established internet population
statistics for a certain time period. This step ensures that the collected data, after
the weighting process, is representative of the Internet population of the measured
geographical region. To derive greater accuracy, a further breakdown of the official
data showing the Internet population statistics may be performed into a combination
of various groups or subjects. Such groups may include sex, age, current access
method and income. Thus the collected data from page impressions from the sample
users may be tabled in terms of each of the categories mentioned above to provide a
more accurate picture to interested groups. Furthermore, the breakdown may be in
terms of categories relating to the types of monitored resources, for example, sport,
politics, entertainment, business.
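
A minimal sketch of this kind of demographic weighting is given below, assuming
a breakdown by a single category. The shares and panel counts are invented purely
for illustration, and cell weighting of this form (population share divided by
sample share) is one common approach rather than necessarily the one used here.

```python
def cell_weights(population_share, sample_counts):
    """Weight for each demographic cell = population share / sample share.

    Assumes every cell has at least one panellist (no zero counts).
    """
    total = sum(sample_counts.values())
    weights = {}
    for cell, count in sample_counts.items():
        sample_share = count / total
        weights[cell] = population_share[cell] / sample_share
    return weights


# Hypothetical figures: the panel over-represents one group.
population_share = {"male": 0.5, "female": 0.5}
sample_counts = {"male": 1800, "female": 1200}

weights = cell_weights(population_share, sample_counts)

# After weighting, the panel total is unchanged but its profile matches
# the population profile.
weighted_total = sum(weights[c] * n for c, n in sample_counts.items())
```

Over-represented groups receive a weight below one and under-represented groups
a weight above one, so that the weighted panel mirrors the population breakdown.
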

There will be an overlap of the data source and further data source results
where a monitored resource, having say site centric measurements available, has
corresponding further data source results pertaining to panellists. Thus, for
example, for a monitored web site there is panel data collected from each of the
panellists for the same monitored web site. Comparable data is therefore taken
from the two corresponding different sources, being panel data, which may pertain
to various interactions performed by the panellists, and the abovementioned site
centric measurements.

If, for example, a panel or group of 3000 users is registered from which
panel data is obtained, then a weighting function is applied to determine or estimate
actual traffic levels for all internet users in a particular region. For example, in
Australia there is an estimated total internet audience of 4.4 million. Weighting is
simply applied as a multiplication factor which brings the representative sample in
line with the total traffic market trends, that is, 4,400,000/3000 = 1466.7. All
unique visitor numbers for sites or page impressions are multiplied or weighted by
this factor in order to estimate the actual traffic levels.

Of the 3000 users who are taking part in the panel, say 2000 users visit a
monitored web site (resource) from server 2 or perform particular interactions on
that web site, which has corresponding site centric measurements output available,
and another 2500 panellists visit a web site that is not monitored, say at server 8. As
the other web site is not monitored there is no site centric measurement data
available, and so to estimate the total traffic or users that would access the other web
site or perform particular interactions on that web site or on a web page of that web
site, the following occurs.

The 2000 users who have accessed the web site that is monitored, at server 2,
are scaled up in accordance with the internet audience. Thus, we take the total
number of the internet audience, being 4,400,000, divide it by the number of
panellists taking part in the sample, being 3000, and multiply this by 2000, which
represents the number of panellists estimated to have actually visited that site. This
results in an expected 2,933,333.3 users in the internet population visiting this site
over the predefined period. This is the ideal situation where we would expect the
numbers obtained, after scaling up, and the actual site centric measurements to
correspond exactly. Equivalently, the number of users in the internet audience
expected to visit the unmonitored site, at server 8, is 4,400,000/3000 x 2500 =
3,666,666.6 visits.
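
The scaling arithmetic above can be sketched in a few lines of Python. The
audience size, panel size and panellist counts are the figures from the example;
the function and variable names are illustrative only.

```python
INTERNET_AUDIENCE = 4_400_000  # estimated total internet audience (from the example)
PANEL_SIZE = 3_000             # number of registered panellists (from the example)


def scale_up(panel_visitors, audience=INTERNET_AUDIENCE, panel_size=PANEL_SIZE):
    """Project a count observed in the panel onto the whole audience."""
    return panel_visitors * audience / panel_size


# Multiplication factor applied to every panel count (about 1466.7).
weighting_factor = INTERNET_AUDIENCE / PANEL_SIZE

# 2000 panellists visited the monitored site, 2500 the unmonitored site.
monitored_estimate = scale_up(2_000)
unmonitored_estimate = scale_up(2_500)
```
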

However, inherent in the sampling there are expected to be deviations and
therefore calibration in terms of an error rate is introduced, being the ratio of the site
centric measurements to that of the equivalent panellist metrics. Separate metrics
may be used to improve accuracy, such as one for page impressions, advertisement
views, unique visitors, or other traffic measurements or other resource metrics.
Each of the error rates is derived for the metrics for the particular period under
review.

Thus, for the above example, if the actual census data for the number of visits
to the monitored web site is 3,200,000, then the actual deviation is
3,200,000/2,933,333.3, which provides a ratio of 1.0909, so that the sample has an
error rate of a factor of 0.0909. This ratio of 1.0909 is then multiplied by the
derived figure above (3,666,666.6) for the site that is not monitored, which gives an
estimate of 4,000,000 visits or uses of the attributes.
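
The calibration step can be illustrated as follows, using only the figures given
in the example. The function names are illustrative assumptions, and the sketch
covers the single-site, single-metric case described so far.

```python
def calibration_ratio(site_centric_count, scaled_panel_estimate):
    """Ratio of the site centric measurement to the scaled-up panel estimate."""
    return site_centric_count / scaled_panel_estimate


def calibrated_estimate(scaled_panel_estimate, ratio):
    """Apply the ratio from a monitored site to an unmonitored site's estimate."""
    return scaled_panel_estimate * ratio


audience, panel_size = 4_400_000, 3_000

# Scaled-up panel estimates for the monitored and unmonitored sites.
monitored_scaled = 2_000 * audience / panel_size
unmonitored_scaled = 2_500 * audience / panel_size

# Census data for the monitored site is 3,200,000 visits in the example.
ratio = calibration_ratio(3_200_000, monitored_scaled)
error_rate = ratio - 1
estimate = calibrated_estimate(unmonitored_scaled, ratio)
```

With these inputs the ratio is about 1.0909 and the calibrated estimate for the
unmonitored site is about 4,000,000 visits, matching the figures in the text.
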

The above derived example related to using only one monitored site.
However, similar or other techniques can be applied on a group of resources, such as
a number of web sites or advertising page impressions. Furthermore, different
metrics, based on different requirements, may need alternative calibrations, such
metrics including page impressions, unique visitors or time measurement. The
calibration may be based on two data sources or more than two data sources,
whether they be from monitored or unmonitored resources.

Thus, by using the above method, sites that are not monitored can have
additional data available to them to estimate the amount of traffic, which provides an
invaluable resource to interested parties to specifically target users in respect of
various activities or interactions that they have undergone in accessing a particular
web site. Furthermore, it provides additional information to owners of monitored
web sites as to how many visits or interactions/responses unmonitored web sites
(being potential competitors to such owners) have had from the internet audience,
based on the two or more sources of data, from the site centric measurements and/or
from the user centric measurements, or simply based on the site centric
measurements. Thus more information is available about the behaviour of the
Internet population or audience.

In the abovementioned process, in order to produce comparable data, sites
having site centric data collected are grouped into the same grouping of sites which
is made in the user centric data. Thereafter, the same groupings of URLs in the site
centric and user centric groups are then formed. Naturally, the bigger this group
is in terms of the number of monitored resources or page impressions, for
example, the more accurate the end results are expected to be.
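
One way such comparable groupings might be formed is sketched below. It assumes,
purely for illustration, that URLs are grouped by hostname and that each record is
a single page impression; the specification does not state how the groupings are
defined, and the function names are hypothetical.

```python
from collections import Counter
from urllib.parse import urlsplit


def group_by_site(urls):
    """Count page impressions per site, using the hostname as the group key."""
    return Counter(urlsplit(u).hostname for u in urls)


def comparable_groups(site_centric_urls, user_centric_urls):
    """Return {site: (site_centric_count, user_centric_count)} for shared sites."""
    site_counts = group_by_site(site_centric_urls)
    user_counts = group_by_site(user_centric_urls)
    # Only sites present in both sources yield comparable data.
    shared = site_counts.keys() & user_counts.keys()
    return {s: (site_counts[s], user_counts[s]) for s in shared}
```

For example, given site centric impressions for a.example and b.example and panel
impressions for a.example and c.example, only a.example would be retained as a
comparable group.
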

With reference to Figure 2, various users may have user interface means that
are, for example, WAP-enabled processors such as a mobile telephone 40, linked to a
cellular network 46 which in turn is linked to the internet 18 through a WAP
gateway 48. Each of the WAP-enabled devices uses the Wireless Markup Language
(WML). Accesses to or interactions with a monitored resource are recorded by
embedding measurement code in the monitored resource and this is forwarded to
collection servers 22, 24 of the data collection and processing means 20 as in the
previous example. Those users forming part of a panel or group have corresponding
interactions monitored whereby measurement code, as mentioned earlier, is
downloaded with the requested resource(s) into the WAP-enabled devices to
monitor the actions or interactions of each of the users of the devices. Each of the
interactions of the users is monitored and recorded by the measurement code for
corresponding interactions on monitored and unmonitored resources and forwarded
to collection servers 26, 28 of the data collection and processing means 20.
Information in the form of reports may be displayed to interested parties after
combining the two separate sources of data, as processed by processing servers 30
and 32, on reporting server 34. Again, this follows the forwarding of the processed
data from the processing servers 30, 32 to data storage means 35 which is accessed
by the reporting server 34. Of particular interest, information about interactions of
the various users for sites that are not monitored is available by previously
mentioned calibration techniques.

The above principles are easily adapted to Web television, whereby each of
the devices 10, 12, 14 or 16 is a television receiver such that users are monitored in
terms of their responses or choices of options regarding a particular television
program or television commercial. Thus there are a number of sample TV users
having respective television receivers accessing the internet who are monitored in
terms of their responses or interactions on a particular resource server by the
abovementioned measurement code accompanying each of the resources being
downloaded to each of the users' devices. For various resources the site centric
measurement data is already available and there will be some resources that overlap
with the recorded user centric data. Thus information pertaining to various
interactions or actions by many users is obtainable for other sites that are not
monitored, which thereby provides a good comparison of resource usage, for
example, of various web sites, to interested parties.

With reference to Figure 3, there is shown a digital television network 50 to
which are linked various television receivers 10, 12 and 14 that have users who have
agreed to be part of a survey for monitoring their responses or actions for a
particular resource such as a television program or commercial. There are also other
users 54 who have digital television receivers for which no monitoring is conducted.
A television station resource server 2 will transmit various programs or
advertisements over the digital network 50 and for some programs will already have
site centric measurement data, such as audit or census data, available on all of the
users (users on receivers 10, 12, 14 and 54) who view the program and make
responses or actions where required regarding that program or advertisement.
Again, a sample of users is used to obtain panel data by measurement means provided
to each of the sample user receivers 10, 12 and 14 that can track their movements and
actions in relation to other unmonitored corresponding advertisements or other
unmonitored programs from different television networks, such as TV station
resource servers 4 and 6, as well as the monitored programs from TV station
resource server 2. All this information can be used to calibrate or check panel data
for which interested parties such as users in the on-line industry or those in the
television industry can receive reports through reporting server 34 on popular
programs that are watched or advertisements that are responded to across the whole
digital television user population. The abovementioned error rate is also applicable
in determining numbers of users who respond to or interact in other ways with
resources that are not monitored. The data collection and processing means 20
includes the first and second collection server means 22 to 28, processing servers
30 and 32, data storage means 35 and reporting server 34 to undertake similar tasks
as mentioned with respect to the embodiments described in Figures 1 and 2. Each 
of the collection servers and processing servers and the storage means 35 and 
reporting server 34 may be separate servers or function as one server unit. The data 
collection and processing means 20 may form part of a television station for 
calculating and collecting the data and the calibration applicable to unmonitored
resources, such as other programs. 

The medium in which the two data sources are obtained need not be the
same. For example, site centric measurement data may be obtained for internet
based resources and be compared with or correlated with user centric measurement
data for Web TV users or digital television users.

With reference to Figure 4(a), there are shown a number of steps used by the
method and system of this invention in respect of any medium and resources
thereof. Firstly, at step 60, a resource such as a web site or web page or television
program is monitored to determine and record all interactions with or accesses to
the resource by all users having access to the resource.

A data source, such as site centric data, is obtained for one or more
interactions at step 62 from all users who interact in some way with the monitored
resource. This is recorded and collected by the collection servers 22, 24 of the data
collection and processing means 20. By way of example, the number of visits to a
particular web page may be recorded.

After establishing a panel or group of users, at step 64 these users are
monitored for their interactions and at step 66 a second data source, such as panel
data or any other form of data, is measured, recorded and collected by collection
servers 26, 28 of the data collection and processing means 20. The panel data may
comprise for example page impressions or the number of visits each panellist has for
the monitored resource, such as a web site, and every unmonitored resource. At step
68, the two sets of data sources may be viewed, combined or otherwise customised
on server 34.
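
As an illustration of step 68, the two data sources might be combined per
resource as in the following Python sketch. The resource names, the counts and
the function name combine_sources are hypothetical and chosen only for
illustration; the specification does not prescribe a data layout.

```python
from typing import Dict, Optional, Tuple

# Hypothetical site centric data (step 62): total visits per monitored resource.
site_centric = {"site-a.example": 12000}

# Hypothetical panel data (step 66): visits by panellists to every resource,
# whether monitored or not.
panel = {"site-a.example": 60, "site-b.example": 45}


def combine_sources(
    site: Dict[str, int], panel_data: Dict[str, int]
) -> Dict[str, Tuple[Optional[int], Optional[int]]]:
    """Return {resource: (site_centric_visits, panel_visits)} for step 68.

    None marks a source that has no data for that resource; an unmonitored
    resource, for example, has no site centric figure.
    """
    resources = sorted(set(site) | set(panel_data))
    return {r: (site.get(r), panel_data.get(r)) for r in resources}


combined = combine_sources(site_centric, panel)
for resource, (site_visits, panel_visits) in combined.items():
    print(resource, site_visits, panel_visits)
```

Resources with both figures present are the overlap on which a calibration
could be based; resources with only a panel figure are the unmonitored ones.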

In Figure 4(a) two or more data sources may be used for analysis, whether
they originate from the same type of source, for example, monitored resources, or
from different types of sources, for example, one or more monitored resources
and/or one or more unmonitored resources from monitored users.

In Figure 4(b), at step 70 resources are monitored, such as a web site access
or page impression by all users, and at step 72 a data source relating to site centric
measurements is obtained for such web site accesses or page impressions. At step
74, pre-selected users are monitored for their interactions corresponding to the
monitored resource, for example the same web site accesses or page impressions,
and equivalently those same interactions with an unmonitored resource or resources.
At step 76, the further data source relating to the above user centric measurements is
obtained and forwarded to the collection servers 26, 28. Collection servers 22, 24
will have the data from step 72. Processing servers 30, 32 calculate or calibrate an
error rate based on the two sets of data at step 78 after scaling up has taken place
and then at step 80 the error rate is applied to the unmonitored resource(s) from
steps 74 and 76. Reports on the results may be displayed on server 34 for access by
users in the on-line industry after the processed data has been transferred to the
storage means 35 from the processing servers 30, 32.
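
Steps 78 and 80 can be sketched in Python as follows. This is a minimal sketch
under stated assumptions: the error rate is taken here to be the ratio of the
site centric count to the scaled-up panel count for the same monitored
resource, and scaling up is taken to be a simple population-to-panel ratio.
Neither form is confirmed by this passage, and all function names and figures
are hypothetical.

```python
def scale_up(panel_count: float, panel_size: int, population: int) -> float:
    """Project a panel count onto the whole user population (assumed form)."""
    return panel_count * population / panel_size


def error_rate(site_centric_count: float, scaled_panel_count: float) -> float:
    """Step 78 (assumed form): site centric count divided by the scaled-up
    panel count for the same monitored resource."""
    if scaled_panel_count <= 0:
        raise ValueError("scaled panel count must be positive")
    return site_centric_count / scaled_panel_count


def calibrated_estimate(
    panel_count: float, panel_size: int, population: int, rate: float
) -> float:
    """Step 80: apply the error rate to an unmonitored resource."""
    return scale_up(panel_count, panel_size, population) * rate


# Hypothetical figures, for illustration only.
PANEL_SIZE, POPULATION = 1_000, 200_000

# Monitored resource: 15,000 visits measured site centrically,
# 60 visits recorded from the panel.
rate = error_rate(15_000, scale_up(60, PANEL_SIZE, POPULATION))

# Unmonitored resource: 45 panel visits, no site centric data.
estimate = calibrated_estimate(45, PANEL_SIZE, POPULATION, rate)
print(rate, estimate)
```

In practice the rate would presumably be derived from many overlapping
resources rather than one; a single resource is used here to keep the
arithmetic visible.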

In Figure 5(a) there are shown processes used when a user's browser requests
a monitored resource. The browser 80 first of all sends a request (81) for a
monitored resource from resource server 82. The resource is sent back (83) from
the server 82 to the browser 80 with measurement code which was originally
embedded. The measurement code monitors and collects information on the usage
of the resource by the user and at (84) a record of this is sent to the respective
collection server(s) 85. Thereafter the record can be processed together with other
user or site centric measurements by the respective processing server(s) 87. Where
the user is a panellist, the measurement code would already have been sent to the
panellist's browser and the interactions associated with a monitored or unmonitored
resource recorded and sent to the respective collection server(s) 85. Thereafter it is
processed by processing server(s) 87 and forwarded for storage to the data storage
means 88. Reports may then be generated from the data storage means 88 to the
reporting server 89.

In Figure 5(b) there are shown process steps when a panellist requests an
unmonitored resource. The user's browser 80 makes a request (91) for the
unmonitored resource at the resource server 90 through the proxy server 100, and
the resource server 90 returns (92) the requested resource to the browser 80 via the
proxy server 100. The proxy server 100 inserts the measurement code into the
requested resource before forwarding it to the browser 80. Then the measurement
code monitors and collects information on the usage of the unmonitored resource
and forwards this at (93) to the collection server(s) 85, where it is collated as user
centric measurement data. It may then be processed by processing server 87 and
forwarded on to reporting server 89 via the storage means 88 as previously
described. It is to be noted that the collection server(s) 85 may also be one and the
same server as the proxy server 100.
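
The insertion step performed by the proxy server 100 might look like the
following Python sketch, assuming the requested resource is an HTML page. The
script URL, the constant MEASUREMENT_CODE and the function name
insert_measurement_code are hypothetical; the specification does not state
where in the resource the code is placed.

```python
import re

# Hypothetical measurement snippet; the collection-server URL is an assumption.
MEASUREMENT_CODE = '<script src="http://collect.example/measure.js"></script>'

# Matches a closing body tag in any letter case, e.g. </body> or </BODY>.
_BODY_CLOSE = re.compile(r"</body\s*>", re.IGNORECASE)


def insert_measurement_code(html: str, code: str = MEASUREMENT_CODE) -> str:
    """Return html with the measurement code placed before the first </body>.

    If the page has no closing body tag, the code is appended at the end.
    """
    match = _BODY_CLOSE.search(html)
    if match is None:
        return html + code
    return html[: match.start()] + code + html[match.start():]


page = "<html><body><p>Hello</p></body></html>"
print(insert_measurement_code(page))
```

Because the page is rewritten in transit, the resource server 90 needs no
modification, which is what allows unmonitored resources to be measured.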

Rather than obtaining measurements through browsers, or equivalently some 
program means loaded onto a user interface device, specific software may be loaded 
onto the devices 10, 12, 14 or a "hardware" box may be attached to the devices so
that the user may be aware that he or she is being monitored. Alternatively, a proxy 
server may be used. 

Where a proxy server is used, it is invisible to the user and enables an
organisation or interested parties to monitor the internet usage of the panel member
as an alternative to installing software or firmware onto the panel member's user
interface. An advantage of the transparency of this tracing technique is that it
promotes panel continuity.

In accordance with a further embodiment and with reference to Figure 6(a), 
resource requests and responses between user interface devices 10, 12 and 14 and 
the resource servers 2, 4 and 6 go through a proxy server 100. The proxy server
may form part of the data collection and processing means 20. 

Once a user has agreed to become a panel member, the user is instructed to
change his or her browser setting to access the internet via the proxy server 100. If
the user has trouble in effecting this set-up, they may e-mail a helpdesk provided by
the organisation or access a call centre via telephone.

Examples of the manual proxy set-up process will now be described with
reference to some existing Internet browsers.

If the user has Internet Explorer 4.0 or 5.0, to divert their internet access
through a proxy server, they would be required to select "Internet Options" from
their "View" menu, then "Connection Folder", followed by "Access the Internet
using a proxy server". In the address entry box, they would enter the address of the
proxy server, which would be provided to them by the research organisation.

Alternatively, if the user had Netscape 4.0, they would be required to select
"Preferences" in the "Edit" menu of their browser, followed by "Advanced",
"Proxies", "Manual Proxy Configuration" and "View". In the http: entry box they
would then be required to enter the address of the proxy server, as provided by the
party initiating the network measurement.

As an alternative to the manual set-up process, a software program may be 
used to effect the browser setting change: for example, the user could click on a 
link, and the link would then implement the change.

With reference to Figure 7, when a user requests a resource on their browser
110, the request first goes (112) to the proxy server 100. The request is then
forwarded (114) by the proxy server 100 to the corresponding resource server 116.
The resource server passes (118) the requested resource to the proxy server 100 and
from there (120) the measurement code is embedded in the requested resource at
the proxy server 100 before it goes back to the user's browser 110 to monitor the
interactions of the user. A record of this request is then sent (122) to the collection
server of the data collection and processing means 20 for processing as part of the
data source, where site centric measurements are collected for this particular user
and other users in respect of similar resource requests. If the data relates to the
further data source whether monitored or unmonitored, for example panel data of a
panellist, like user centric measurements, then this procedure is repeated but the data
is collected by the respective collection server 26, 28. It is to be noted that the
collection servers 22 to 28 and the proxy server 100 may be the same server.

Thus, for some monitored resources there will be an overlap of site and user
centric measurements for which data may be displayed separately or combined on
reporting server 34. Alternatively an estimate of traffic data can be determined for
those unmonitored resources having no site centric measurements available, using
the aforementioned techniques.

When the access request is diverted to the proxy server 100, the panel
member is able to be identified by virtue of an identification means such as a user ID
or a unique cookie assigned to the member during the sign up process. A cookie is a
feature of the internet protocol Hypertext Transfer Protocol (HTTP), which is
essentially a unique identifier stored on the user's computer.
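
Identification by cookie at the proxy server could be sketched as below, using
the standard Python http.cookies module to parse the Cookie request header.
The cookie name panel_id and the function name panellist_id are hypothetical;
the specification does not name the cookie.

```python
from http.cookies import SimpleCookie
from typing import Optional

# Hypothetical name of the cookie assigned during the sign up process.
PANEL_COOKIE = "panel_id"


def panellist_id(cookie_header: str) -> Optional[str]:
    """Return the panel member's identifier from a Cookie header value,
    or None if the request carries no panel cookie."""
    cookies = SimpleCookie()
    cookies.load(cookie_header)
    morsel = cookies.get(PANEL_COOKIE)
    return morsel.value if morsel is not None else None


print(panellist_id("panel_id=abc123; theme=dark"))
```

A request without the cookie yields None, in which case the proxy could fall
back to another identification means such as a user ID.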

During the processing of the data it is possible to check for any anomalous
usage of sites (e.g. one user visiting a particular site fifty times in one day) that may
not be representative of the overall sample of panellists. If anomalies like this are
found, the particular data may then be disregarded.
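
A check of this kind might be sketched in Python as follows. The threshold of
fifty visits is taken from the example above, but treating it as a fixed
cut-off, and discarding all of that user's visits to the site on that day, are
assumptions made for illustration; the names are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Tuple

# One recorded visit: (user_id, site, day).
Visit = Tuple[str, str, str]

# Hypothetical cut-off: this many visits by one user to one site in one day
# is treated as anomalous.
MAX_VISITS_PER_DAY = 50


def drop_anomalous(
    visits: Iterable[Visit], limit: int = MAX_VISITS_PER_DAY
) -> List[Visit]:
    """Return the visits with every anomalous (user, site, day) group removed."""
    visits = list(visits)
    counts = Counter(visits)
    return [v for v in visits if counts[v] < limit]


# Hypothetical data: user u1 hits the same site fifty times in a day.
sample = [("u1", "site-a.example", "2001-02-16")] * 50
sample += [("u2", "site-a.example", "2001-02-16")] * 3
kept = drop_anomalous(sample)
print(len(kept))
```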

When recording interactions of a panel of users at the data collection and
processing means 20, a view of internet usage by the "panel population" is able to
be obtained. The data obtained via this panel approach may be used in isolation to
obtain relevant statistics. Alternatively, as previously mentioned, a fusion of the
panel data with site centric measurement data such as from browser based data or
proxy or server logs may be used. In this alternative way, it is possible to fill in the
reporting properties or interactions of resources for which accurate site centric
measurement data is not available, in order to improve the overall market
measurement accuracy.

The user details should be periodically validated, so from time to time the 
users should be contacted to confirm participation and verify personal details. 

Variations and additions are possible within the general inventive concept as
will be apparent to those skilled in the art. In particular, if a user's browser or
interface device does not support Java, alternative approaches for obtaining
measurement data are possible and within the inventive concept, such as via CGI
(Common Gateway Interface) measurement.



